AI infrastructure that
developers love

Run inference, training, and batch processing with sub-second cold starts, instant autoscaling, and a developer experience that feels local.

Get Started

PROGRAMMABLE INFRA

BUILT FOR PERFORMANCE

ELASTIC GPU SCALING

UNIFIED OBSERVABILITY

Why Modal

Designed to help AI teams
deploy faster.

Programmable infra

Define everything in code, no YAML or config files. Keep environment and hardware requirements in sync.

Built for performance

Launch and scale containers in seconds to keep feedback loops tight and latency low.

Elastic GPU scaling

Elastic GPU capacity and access to thousands of GPUs across clouds. No quotas or reservations. Scale back to zero when not in use.

Unified observability

Integrated logging and full visibility into every function, container, and workload.

PRODUCTS

Powering any ML workload

PLATFORM

Build on a powerful foundation

From filesystem to runtime, every layer of Modal’s platform is engineered to give you the tools to build robust, scalable data applications.

Learn More

AI-native runtime

Engineered from the ground up for heavy AI workloads, built for super-fast autoscaling and model initialization and 100x faster than Docker.

Built-in storage layer

A globally distributed storage system built for high throughput and low latency. Designed for fast model loading, training data, or other datasets.

First-party integrations

Mount your existing cloud buckets, connect to MLOps tools, and send data to your existing telemetry vendors.

Multi-cloud capacity pool

Deep multi-cloud capacity with intelligent scheduling ensures you always have the CPUs and GPUs you need without managing input orchestration.

Built with Modal

All examples

Transcribe speech in batches with Whisper

Turn audio bytes into text at scale

Voice chat with LLMs

Build an interactive voice chat app

Transcribe speech with Kyutai STT

Stream transcripts at the speed of speech

Make music

Turn prompts into music with ACE-Step

Fast podcast transcriptions

Build an end-to-end podcast transcription app that leverages dozens of containers for super-fast processing

Fine-tune Whisper on domain vocab

Improve Whisper transcription accuracy on specialized vocabularies with fine-tuning

Deploy a TTS API with Chatterbox

Serve text-to-speech with Chatterbox to generate natural audio from text

Security and governance

Learn More

“We use Modal to run edge inference with <10ms overhead and batch jobs at large scale. Our team loves the platform for the power and flexibility it gives us.”

Brian Ichter, Co-founder

“Modal makes it easy to write code that runs on 100s of GPUs in parallel, transcribing podcasts in a fraction of the time.”

Mike Cohen, Head of Data

Read Case Study

“Everyone here loves Modal because it helps us move so much faster. We rely on it to handle massive spikes in volume for evals, RL environments, and MCP servers. Whenever a team asks about compute, we tell them to use Modal.”

Aakash Sabharwal, VP of Engineering

“We've previously managed to break services like GitHub because of our load, so Modal handling our massive scale so smoothly means a lot. We trust Modal to keep up with our growth, and we're excited to build together in the long term.”

Anton Osika, CEO & Founder

Read Case Study

Join Modal's developer community

Modal Community Slack

Igor Kotua

Engineer, The Linux Foundation

If you building AI stuff with Python and haven't tried @modal you are missing out big time

Caleb

ML Engineer, Hugging Face

Bullish on @modal - Great Docs + Examples - Healthy Free Plan (30$ free compute / month) - Never have to worry about infra / just Python

Daniel Rothenberg

Co-founder, Brightband

@modal continues to be magical... 10 minutes of effort and the `joblib`-based parallelism I use to test on my local machine can trivially scale out on the cloud. Makes life so easy!

@mattzcarey.com on blsky

AI Engineer, StackOne

@modal has got a bunch of stuff just worked out this should be how you deploy python apps. wow

Erin Boyle

ML Engineer, Tesla

This tool is awesome. So empowering to have your infra needs met with just a couple decorators. Good people, too!

Aman Kishore

Research Engineer, Harvey

If you are still using AWS Lambda instead of @modal you're not moving fast enough

Jai Chopra

Product, LanceDB

Recently built an app on Lambda and just started to use @modal, the difference is insane! Modal is amazing, virtually no cold start time, onboarding experience is great 🚀

Izzy Miller

DevRel, Hex

special shout out to @modal and @_hex_tech for providing the crucial infrastructure to run this! Modal is the coolest tool I’ve tried in a really long time— cannnot say enough good things.

Diego Fernandes

Co-founder & CTO, RocketSeat

Probably one of the best piece of software I'm using this year: modal.com

Mark Tenenholtz

Head of AI, PredeloHQ

I use @modal because it brings me joy. There isn't much more to it.

Adam Azzam

Product, Prefect

feels weird at this point to use anything else than @modal for this — absolutely the GOAT of dynamic sandboxes

Mark Neumann

Head of ML, Orbital Materials

Used @modal for the first time today - immediate "oh, this is how backends should work" moment, similar to using Vercel for the first time for frontend deployments.

Rémi 📎

Co-founder & CEO, .txt

Nothing beats @modal when it comes to deploying a quick POC

Moin Nadeem

Co-founder, Phonic

I've realized @modal is actually a great fit for ML training pipelines. If you're running model-based evals, why not just call a serverless Modal function and have it evaluate your model on a separate worker GPU? This makes evaluation during training really easy.

Matt Holden

Founder

Late to the party, but finally playing with @modal to run some backend jobs. DX is sooo nice (compared to Docker, Cloud Run, Lambda, etc). Just decorate a Python function and deploy. And it's fast! Love it.